Improving Distantly Supervised Extraction of Drug-Drug and Protein-Protein Interactions

Authors

  • Tamara Bobic
  • Roman Klinger
  • Philippe Thomas
  • Martin Hofmann-Apitius
Abstract

Relation extraction is frequently and successfully addressed by machine learning methods. The downside of this approach is the need for annotated training data, typically generated in tedious, cost-intensive manual work. Distantly supervised approaches make use of weakly annotated data, such as automatically annotated corpora. Recent work in the biomedical domain has applied distant supervision to protein-protein interaction (PPI) extraction with reasonable results, making use of the IntAct database. Such data is typically noisy, and heuristics to filter it are commonly applied. We propose a constraint to increase the quality of the training data, based on the assumption that no self-interaction of real-world objects is described in sentences. In addition, we make use of the University of Kansas Proteomics Service (KUPS) database. These two steps show an increase of 7 percentage points (pp) for the PPI corpus AIMed. We demonstrate the broad applicability of our approach by using the same workflow for the analysis of drug-drug interactions, utilizing relationships available from the drug database DrugBank. We achieve 37.31 % in F1 measure without manually annotated training data on an independent test set.
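
The labeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `label_pairs`, the identifiers `P1` and `P2`, and the toy knowledge base are hypothetical, and the choice to relabel same-entity pairs as negative (rather than drop them) is an assumption, since the abstract does not say how the constraint is enforced.

```python
from itertools import combinations


def label_pairs(entities, known_interactions, self_interaction_constraint=True):
    """Distantly label every entity pair found in one sentence.

    entities: list of (mention_text, normalized_id) tuples (hypothetical format).
    known_interactions: set of frozensets of ids, standing in for a database
        such as IntAct or DrugBank.
    Returns a list of (mention_a, mention_b, label) training instances.
    """
    instances = []
    for (m_a, id_a), (m_b, id_b) in combinations(entities, 2):
        # Distant supervision: a pair is positive if the database lists it.
        label = frozenset((id_a, id_b)) in known_interactions
        # Assumed form of the constraint: two mentions of the same
        # real-world object are never treated as an interaction.
        if self_interaction_constraint and id_a == id_b:
            label = False
        instances.append((m_a, m_b, label))
    return instances


# Toy knowledge base: P1 interacts with P2, and P1 is also listed as self-interacting.
kb = {frozenset(("P1", "P2")), frozenset(("P1",))}
sentence = [("ProtA", "P1"), ("protein A", "P1"), ("ProtB", "P2")]

filtered = label_pairs(sentence, kb)
unfiltered = label_pairs(sentence, kb, self_interaction_constraint=False)
```

Without the constraint, the two mentions of `P1` would yield a positive training instance merely because the database lists a self-interaction; with it, that pair becomes a negative while the `P1`/`P2` pairs stay positive.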

Related resources

Weakly Labeled Corpora as Silver Standard for Drug-Drug and Protein-Protein Interaction

Institute for Computer Science Humboldt-Universität zu Berlin Unter den Linden 6 10099 Berlin Germany Fraunhofer Institute for Algorithms and Scientific Computing (SCAI) Schloss Birlinghoven 53754 Sankt Augustin Germany Bonn-Aachen Center for Information Technology (B-IT) Dahlmannstraße 2 53113 Bonn Germany {tbobic,klinger,hofmann-apitius}@scai.fraunhofer.de {thomas,leser}@informatik.hu-berlin....

In vitro study of drug-protein interaction using electronic absorption, fluorescence, and circular dichroism spectroscopy

In the near future, design of a new generation of drugs targeting proteins will be required. Considering the complex bond between the drug and protein, the structure and stability of the target protein should be considered. So far, a series of in vitro investigations have been conducted with the aim of predicting drug-biological medium interactions. In these studies, use of spectroscopic method...

INVESTIGATIONS ON THE DRUG-PROTEIN INTERACTION OF CERTAIN NEW POTENTIAL LOCAL ANAESTHETICS

Generally, plasma proteins owe their binding capacity to the presence of amino acid units which enter into intra- and intermolecular hydrophobic bonding with a diverse range of endo- and exogenous chemical substances. The intermolecular interactions between the hydrophobic areas of drug molecules and those of plasma proteins play an important role in drug-macromolecular complex formation and...

Pharmaceutical Advances and Proteomics Researches

Proteomics enables understanding the composition, structure, function and interactions of the entire protein complement of a cell, a tissue, or an organism under exactly defined conditions. Some factors such as stress or drug effects will change the protein pattern and cause the presence or absence of a protein or gradual variation in abundances. Changes in the proteome provide a snapshot of the...

Publication date: 2012